Modern deep neural networks have achieved superhuman performance in tasks from image classification to game play. Surprisingly, these various complex systems with massive amounts of parameters exhibit the same remarkable structural properties in their last-layer features and classifiers across canonical datasets. This phenomenon is known as "Neural Collapse," and it was discovered empirically by Papyan et al. \cite{Papyan20}. Recent papers have theoretically shown the global solutions to the training network problem under a simplified "unconstrained feature model" exhibiting this phenomenon. We take a step further and prove the Neural Collapse occurrence for deep linear network for the popular mean squared error (MSE) and cross entropy (CE) loss. Furthermore, we extend our research to imbalanced data for MSE loss and present the first geometric analysis for Neural Collapse under this setting.
translated by 谷歌翻译
3D hand pose estimation from RGB images suffers from the difficulty of obtaining the depth information. Therefore, a great deal of attention has been spent on estimating 3D hand pose from 2D hand joints. In this paper, we leverage the advantage of spatial-temporal Graph Convolutional Neural Networks and propose LG-Hand, a powerful method for 3D hand pose estimation. Our method incorporates both spatial and temporal dependencies into a single process. We argue that kinematic information plays an important role, contributing to the performance of 3D hand pose estimation. We thereby introduce two new objective functions, Angle and Direction loss, to take the hand structure into account. While Angle loss covers locally kinematic information, Direction loss handles globally kinematic one. Our LG-Hand achieves promising results on the First-Person Hand Action Benchmark (FPHAB) dataset. We also perform an ablation study to show the efficacy of the two proposed objective functions.
translated by 谷歌翻译
预测基金绩效对投资者和基金经理都是有益的,但这是一项艰巨的任务。在本文中,我们测试了深度学习模型是否比传统统计技术更准确地预测基金绩效。基金绩效通常通过Sharpe比率进行评估,该比例代表了风险调整的绩效,以确保基金之间有意义的可比性。我们根据每月收益率数据序列数据计算了年度夏普比率,该数据的时间序列数据为600多个投资于美国上市大型股票的开放式共同基金投资。我们发现,经过现代贝叶斯优化训练的长期短期记忆(LSTM)和封闭式复发单元(GRUS)深度学习方法比传统统计量相比,预测基金的Sharpe比率更高。结合了LSTM和GRU的预测的合奏方法,可以实现所有模型的最佳性能。有证据表明,深度学习和结合能提供有希望的解决方案,以应对基金绩效预测的挑战。
translated by 谷歌翻译
跨核心联合学习利用了几百个可靠的数据筒仓,并具有高速访问链接,共同训练模型。尽管这种方法成为联合学习中的流行环境,但设计出强大的拓扑以减少训练时间仍然是一个开放的问题。在本文中,我们提出了一种用于跨核心联合学习的新的多编码拓扑。我们首先使用覆盖图构造多式图。然后,我们将此多数分析为具有孤立节点的不同简单图。隔离节点的存在使我们能够执行模型聚合而无需等待其他节点,从而减少训练时间。我们进一步提出了一种新的分布式学习算法,以与我们的多编码拓扑一起使用。公共数据集的密集实验表明,与最近的最新拓扑相比,我们提出的方法大大减少了训练时间,同时确保收敛并保持模型的准确性。
translated by 谷歌翻译
深度学习在许多应用中取得了巨大成功。然而,其在实践中的部署已经受到两个问题的困扰:由于通常在地理上分布的大量数据传输,必须集中聚合的数据的隐私。解决这两个问题都是具有挑战性的,并且大多数现有工程无法提供有效的解决方案。在本文中,我们开发FEDPC,是隐私保存和沟通效率的联邦深度学习框架。该框架允许在多个私有数据集中学习模型,同时不显示培训数据的任何信息,即使是中间数据。该框架还可以最大限度地减少更新模型的数据量。我们正式证明培训FEDPC及其隐私保留财产时学习模型的融合。我们对大量实验进行了广泛的实验,以评估FEDPC的性能,以近似到上限的性能(培训集中时)和通信开销。结果表明,当数据分配到10个计算节点时,FEDPC在8.5 \%$ 8.5 \%$ 8.5 \%$ 8.5 \%$ 8.5 \%$ 8.5 \%$ 8.5 \%。与现有工程相比,FEDPC还将通信开销降低至42.20±20美元。
translated by 谷歌翻译
In this paper, we propose a novel technique, namely INVALIDATOR, to automatically assess the correctness of APR-generated patches via semantic and syntactic reasoning. INVALIDATOR reasons about program semantic via program invariants while it also captures program syntax via language semantic learned from large code corpus using the pre-trained language model. Given a buggy program and the developer-patched program, INVALIDATOR infers likely invariants on both programs. Then, INVALIDATOR determines that a APR-generated patch overfits if: (1) it violates correct specifications or (2) maintains errors behaviors of the original buggy program. In case our approach fails to determine an overfitting patch based on invariants, INVALIDATOR utilizes a trained model from labeled patches to assess patch correctness based on program syntax. The benefit of INVALIDATOR is three-fold. First, INVALIDATOR is able to leverage both semantic and syntactic reasoning to enhance its discriminant capability. Second, INVALIDATOR does not require new test cases to be generated but instead only relies on the current test suite and uses invariant inference to generalize the behaviors of a program. Third, INVALIDATOR is fully automated. We have conducted our experiments on a dataset of 885 patches generated on real-world programs in Defects4J. Experiment results show that INVALIDATOR correctly classified 79% overfitting patches, accounting for 23% more overfitting patches being detected by the best baseline. INVALIDATOR also substantially outperforms the best baselines by 14% and 19% in terms of Accuracy and F-Measure, respectively.
translated by 谷歌翻译
We present a Machine Learning (ML) study case to illustrate the challenges of clinical translation for a real-time AI-empowered echocardiography system with data of ICU patients in LMICs. Such ML case study includes data preparation, curation and labelling from 2D Ultrasound videos of 31 ICU patients in LMICs and model selection, validation and deployment of three thinner neural networks to classify apical four-chamber view. Results of the ML heuristics showed the promising implementation, validation and application of thinner networks to classify 4CV with limited datasets. We conclude this work mentioning the need for (a) datasets to improve diversity of demographics, diseases, and (b) the need of further investigations of thinner models to be run and implemented in low-cost hardware to be clinically translated in the ICU in LMICs. The code and other resources to reproduce this work are available at https://github.com/vital-ultrasound/ai-assisted-echocardiography-for-low-resource-countries.
translated by 谷歌翻译
Ensemble learning combines results from multiple machine learning models in order to provide a better and optimised predictive model with reduced bias, variance and improved predictions. However, in federated learning it is not feasible to apply centralised ensemble learning directly due to privacy concerns. Hence, a mechanism is required to combine results of local models to produce a global model. Most distributed consensus algorithms, such as Byzantine fault tolerance (BFT), do not normally perform well in such applications. This is because, in such methods predictions of some of the peers are disregarded, so a majority of peers can win without even considering other peers' decisions. Additionally, the confidence score of the result of each peer is not normally taken into account, although it is an important feature to consider for ensemble learning. Moreover, the problem of a tie event is often left un-addressed by methods such as BFT. To fill these research gaps, we propose PoSw (Proof of Swarm), a novel distributed consensus algorithm for ensemble learning in a federated setting, which was inspired by particle swarm based algorithms for solving optimisation problems. The proposed algorithm is theoretically proved to always converge in a relatively small number of steps and has mechanisms to resolve tie events while trying to achieve sub-optimum solutions. We experimentally validated the performance of the proposed algorithm using ECG classification as an example application in healthcare, showing that the ensemble learning model outperformed all local models and even the FL-based global model. To the best of our knowledge, the proposed algorithm is the first attempt to make consensus over the output results of distributed models trained using federated learning.
translated by 谷歌翻译
In the era of Internet of Things (IoT), network-wide anomaly detection is a crucial part of monitoring IoT networks due to the inherent security vulnerabilities of most IoT devices. Principal Components Analysis (PCA) has been proposed to separate network traffics into two disjoint subspaces corresponding to normal and malicious behaviors for anomaly detection. However, the privacy concerns and limitations of devices' computing resources compromise the practical effectiveness of PCA. We propose a federated PCA-based Grassmannian optimization framework that coordinates IoT devices to aggregate a joint profile of normal network behaviors for anomaly detection. First, we introduce a privacy-preserving federated PCA framework to simultaneously capture the profile of various IoT devices' traffic. Then, we investigate the alternating direction method of multipliers gradient-based learning on the Grassmann manifold to guarantee fast training and the absence of detecting latency using limited computational resources. Empirical results on the NSL-KDD dataset demonstrate that our method outperforms baseline approaches. Finally, we show that the Grassmann manifold algorithm is highly adapted for IoT anomaly detection, which permits drastically reducing the analysis time of the system. To the best of our knowledge, this is the first federated PCA algorithm for anomaly detection meeting the requirements of IoT networks.
translated by 谷歌翻译
Consider $n$ points independently sampled from a density $p$ of class $\mathcal{C}^2$ on a smooth compact $d$-dimensional sub-manifold $\mathcal{M}$ of $\mathbb{R}^m$, and consider the generator of a random walk visiting these points according to a transition kernel $K$. We study the almost sure uniform convergence of this operator to the diffusive Laplace-Beltrami operator when $n$ tends to infinity. This work extends known results of the past 15 years. In particular, our result does not require the kernel $K$ to be continuous, which covers the cases of walks exploring $k$NN-random and geometric graphs, and convergence rates are given. The distance between the random walk generator and the limiting operator is separated into several terms: a statistical term, related to the law of large numbers, is treated with concentration tools and an approximation term that we control with tools from differential geometry. The convergence of $k$NN Laplacians is detailed.
translated by 谷歌翻译